WHAT IS CLAIMED IS: 



1. A method for the directional subcloning of DNA fragments comprising:

a) providing a first vector comprising a first selectable marker gene and a
DNA sequence of interest, which DNA sequence of interest is flanked by at least
two restriction enzyme sites, wherein at least one of the flanking restriction enzyme
sites is a site for a first restriction enzyme which has infrequent restriction sites in
cDNAs or open reading frames from at least one species and generates
complementary single-strand DNA overhangs, wherein at least one of the flanking
restriction enzyme sites is for a second restriction enzyme which has infrequent
restriction sites in cDNAs or open reading frames from at least one species and
generates ends that are not complementary to the overhangs generated by the first
restriction enzyme, wherein digestion of the first vector with the first restriction
enzyme and the second restriction enzyme generates a first linear DNA
fragment which lacks the first selectable marker gene but comprises the DNA
sequence of interest;

b) providing a second vector comprising a second selectable marker gene
which is distinguishable from the first selectable marker gene and non-essential
DNA sequences, optionally including a counterselectable gene, which non-essential
sequences are flanked by at least two restriction enzyme sites, wherein at least one
of the flanking restriction enzyme sites in the second vector is for a third restriction
enzyme which generates complementary single-strand DNA overhangs that are
complementary to the single-strand DNA overhang generated by the first restriction
enzyme in the first linear DNA fragment, wherein at least one of the flanking
restriction sites in the second vector is for a fourth restriction enzyme which
generates ends that are not complementary to the ends generated by the first or third
restriction enzyme but can be ligated to the ends generated by the second restriction
enzyme, and wherein digestion of the second vector with the third restriction
enzyme and the fourth restriction enzyme generates a second linear DNA fragment
which lacks non-essential DNA sequences but comprises the second selectable
marker, which second linear DNA fragment is flanked by ends which permit the
oriented joining of the first linear DNA fragment to the second linear DNA
fragment; and

c) combining the first and second vectors, the first vector and the second
linear DNA fragment, or the second vector and the first linear DNA fragment in a
suitable buffer with one or more restriction enzymes and optionally DNA ligase
under conditions effective to result in digestion and optionally ligation to yield a
mixture optionally comprising a third vector comprising the first and second linear
DNA molecules which are joined in an oriented manner.

2. The method of claim 1 wherein the second restriction enzyme generates
blunt ends and the first linear DNA fragment is flanked by a first single-strand DNA
overhang and a blunt end.

3. The method of claim 1 wherein the first and third restriction enzymes are not
the same.

4. The method of claim 1 wherein the second and fourth restriction enzymes 
are not the same. 

5. The method of claim 1 wherein the second and fourth restriction enzymes
generate blunt ends.

6. The method of claim 1 wherein the first restriction enzyme is SgfI.

7. The method of claim 6 wherein the second restriction enzyme is PmeI.

8. The method of claim 1 wherein the third restriction enzyme generates a 3'
TA overhang.
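
The end compatibility relied on in claims 6 to 9 can be checked with a minimal sketch. It assumes the commonly published recognition sequences and top-strand cut positions for SgfI, PvuI, PacI and PmeI (these values are not stated in the claims), and the helper name `sticky_end` is hypothetical. The overhang is written 5' to 3'; read 3' to 5' the two-base overhang is the "TA" of claim 8.

```python
def sticky_end(site: str, cut: int) -> tuple[str, str]:
    """Return (end_type, overhang) for a palindromic recognition site.

    `cut` is the number of top-strand bases 5' of the cleavage point.
    For a palindromic site the bottom strand is cut at the mirror
    position, len(site) - cut, in top-strand coordinates.
    """
    mirror = len(site) - cut
    if cut == mirror:
        return ("blunt", "")
    if cut > mirror:
        # top strand is cut further 3' than the bottom strand
        return ("3'", site[mirror:cut])
    return ("5'", site[cut:mirror])


# Assumed (published) sites and cut positions; not taken from the claims.
ENZYMES = {
    "SgfI": ("GCGATCGC", 5),  # GCGAT^CGC
    "PvuI": ("CGATCG", 4),    # CGAT^CG
    "PacI": ("TTAATTAA", 5),  # TTAAT^TAA
    "PmeI": ("GTTTAAAC", 4),  # GTTT^AAAC
}

ends = {name: sticky_end(site, cut) for name, (site, cut) in ENZYMES.items()}
for name, (kind, overhang) in ends.items():
    print(f"{name}: {kind} {overhang}")
```

Under these assumptions SgfI, PvuI and PacI all leave the same two-base 3' overhang, so their ends can anneal to one another, while PmeI leaves a blunt end that cannot anneal to any of them.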

9. The method of claim 8 wherein the third restriction enzyme is PvuI or PacI.

10. The method of claim 1 wherein the DNA sequence of interest comprises an
open reading frame comprising one or more sites for the first or second restriction
enzyme.

11. The method of claim 10 wherein prior to digestion with the one or more
restriction enzymes, the sites for the one or more restriction enzymes in the open
reading frame are protected so as to prevent digestion.

12. The method of claim 11 wherein the sites are protected by methylation.

13. The method of claim 12 wherein prior to methylation the flanking sites for 
the first or second restriction enzyme are contacted with an oligonucleotide 
complementary to the flanking restriction enzyme site and RecA. 

14. A vector system for cloning comprising:

a first vector comprising a first selectable marker gene and a DNA sequence
of interest, which DNA sequence of interest is flanked by at least two restriction
enzyme sites, wherein at least one of the flanking restriction enzyme sites is a site
for a first restriction enzyme which has infrequent restriction sites in cDNAs or
open reading frames from at least one species and generates complementary
single-strand DNA overhangs, wherein at least one of the flanking restriction enzyme sites
is for a second restriction enzyme which has infrequent restriction sites in cDNAs or
open reading frames from at least one species and generates ends that are not
complementary to the overhangs generated by the first restriction enzyme, wherein
digestion of the first vector generates a first linear DNA fragment which lacks the
first selectable marker gene but comprises the DNA sequence of interest, wherein
the restriction enzyme sites are designed such that the first linear DNA fragment can
be religated directly to a second vector comprising a second selectable marker gene
which is distinguishable from the first selectable marker gene and non-essential
DNA sequences, optionally including a counterselectable gene, which non-essential
DNA sequences are flanked by at least two restriction enzyme sites, wherein at
least one of the flanking restriction enzyme sites in the second vector is for a third
restriction enzyme which generates complementary single-strand DNA overhangs
which are complementary to the single-strand DNA overhangs generated by the first
restriction enzyme, wherein at least one of the flanking restriction sites in the second
vector is for a fourth restriction enzyme which generates ends that are not
complementary to the ends generated by the first or third restriction enzyme but can
be ligated to the ends generated by the second restriction enzyme, wherein digestion
of the second vector with the third and fourth restriction enzymes generates a
second linear DNA fragment which lacks the non-essential DNA sequences but
comprises the second selectable marker gene, wherein the second linear DNA
fragment is flanked by ends which permit the oriented joining of the first linear
DNA fragment to the second linear DNA fragment.

15. The vector system of claim 14 wherein the second restriction enzyme
generates blunt ends and the first linear DNA fragment is flanked by a first
single-strand DNA overhang and a blunt end.

16. The vector system of claim 14 wherein the first and third restriction enzymes 
are not the same. 


17. The vector system of claim 14 wherein the second and fourth restriction
enzymes are not the same.

18. The vector system of claim 14 wherein the second and fourth restriction
enzymes generate blunt ends.

19. The vector system of claim 14 wherein ligation and oriented joining yields a
third vector encoding an N-terminal fusion protein which is encoded by the DNA
sequence of interest and nucleic acid sequences 5' to the 3' end of the second linear
DNA fragment.

20. The vector system of claim 14 wherein ligation and oriented joining yields a
third vector encoding a C-terminal fusion protein which is encoded by the DNA
sequence of interest and nucleic acid sequences 3' to the 5' end of the second linear
DNA fragment.

21. The vector system of claim 14 wherein ligation and oriented joining yields a 
third vector encoding a fusion protein which is encoded by the DNA sequence of 
interest and nucleic acid sequences 5' and 3' to the respective 3' and 5' end of the 
second linear DNA fragment. 


22. The vector system of claim 14 wherein ligation and oriented joining yields a 
third vector encoding a fusion protein encoded by the DNA sequence of interest and 
the exchange site(s) created by the oriented joining. 

23. The vector system of claim 14 wherein one of the restriction enzymes is
AarI, AscI, BbrCI, CspI, DraI, FseI, NotI, NruI, PacI, PmeI, PvuI, SapI, SdaI, SfiI,
SgfI, SplI, SrfI, SwaI, or a restriction enzyme which has the same recognition site as
AarI, AscI, BbrCI, CspI, DraI, FseI, NotI, NruI, PacI, PmeI, PvuI, SapI, SdaI, SfiI,
SgfI, SplI, SrfI, SwaI.

24. The vector system of claim 20, 21 or 22 wherein the fusion protein is a GST 
fusion protein, GFP fusion protein, thioredoxin fusion protein, maltose binding 
protein fusion protein, protease cleavage site fusion protein, metal binding domain 
fusion protein or dehalogenase fusion protein. 


25. The vector system of claim 20, 21 or 22 wherein the fusion protein is more 
soluble, easier to purify or easier to detect relative to the corresponding non-fusion 
protein. 

26. A kit comprising the vector system of claim 14.

27. A method for producing a vector suitable for expression of an amino acid
sequence of interest, comprising:

combining at least two vectors in a suitable buffer with one or more
restriction enzymes and optionally DNA ligase under conditions effective to result
in digestion and optionally ligation to yield a mixture optionally comprising a third
vector, wherein a first vector comprises a first selectable marker gene and a DNA
sequence of interest, which DNA sequence of interest is flanked by at least two
restriction enzyme sites, wherein two or more of the flanking restriction enzyme
sites are sites for a first restriction enzyme which is a hapaxoterministic restriction
enzyme, wherein digestion of the first vector with the first restriction enzyme
generates a first linear DNA fragment which lacks the first selectable marker gene
but comprises the DNA sequence of interest and a first pair of non-self-complementary
single-strand DNA overhangs, wherein a second vector comprises a second
selectable marker gene which is distinguishable from the first selectable marker
gene and non-essential DNA sequences that optionally include a counterselectable
gene, which non-essential DNA sequences are flanked by two or more restriction
enzyme sites, wherein two or more of the flanking sites in the second vector are for
a second restriction enzyme which is a hapaxoterministic restriction enzyme,
wherein digestion of the second vector with the second restriction enzyme generates
a second linear DNA fragment which lacks non-essential DNA sequences but
comprises the second selectable marker gene and a second pair of non-self-complementary
single-strand DNA overhangs, wherein each of the second pair of
the non-self-complementary DNA overhangs is complementary to only one of the
single-strand DNA overhangs of the first pair of non-self-complementary
single-strand DNA overhangs and permits the oriented joining of the first linear DNA
fragment to the second linear DNA fragment.
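
The pairing condition at the end of claim 27 can be sketched as a small check. This is an illustration only: the function names and the three-base example overhangs are hypothetical, and each overhang is written 5' to 3' on its own strand, so two ends anneal when one is the reverse complement of the other.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def is_self_complementary(overhang: str) -> bool:
    return overhang == revcomp(overhang)


def permits_oriented_joining(first_pair, second_pair) -> bool:
    """True if each overhang of the second pair anneals to exactly one
    overhang of the first pair, no overhang is self-complementary, and
    the two second-pair overhangs pick different partners."""
    if any(is_self_complementary(e) for e in (*first_pair, *second_pair)):
        return False
    partners = []
    for end in second_pair:
        matches = [i for i, other in enumerate(first_pair)
                   if other == revcomp(end)]
        if len(matches) != 1:
            return False
        partners.append(matches[0])
    return partners[0] != partners[1]


# Hypothetical overhangs, chosen only to exercise the check.
ok = permits_oriented_joining(("TAG", "ACT"), ("CTA", "AGT"))
ambiguous = permits_oriented_joining(("TAG", "TAG"), ("CTA", "CTA"))
palindromic = permits_oriented_joining(("AATT", "ACT"), ("AATT", "AGT"))
print(ok, ambiguous, palindromic)
```

The second and third calls fail for the two reasons the claim excludes: identical overhangs allow the insert to ligate in either orientation, and a self-complementary overhang can anneal to itself.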

28. A method for producing a vector suitable for expression of an amino acid
sequence of interest, comprising:

combining at least two vectors in a suitable buffer with one or more
restriction enzymes and optionally DNA ligase under conditions effective to result
in digestion and optionally ligation to yield a mixture optionally comprising a third
vector, wherein a first vector comprises a first selectable marker gene and a DNA
sequence of interest, which DNA sequence of interest is flanked by at least two
restriction enzyme sites, wherein at least one of the flanking restriction enzyme sites
is a site for a first restriction enzyme which has infrequent restriction sites in cDNAs
or open reading frames from at least one species and generates complementary
single-strand DNA overhangs, wherein at least one of the flanking restriction
enzyme sites is for a second restriction enzyme which has infrequent restriction sites
in cDNAs or open reading frames from at least one species and generates ends that
are not complementary to the overhangs generated by the first restriction enzyme,
wherein digestion of the first vector generates a first linear DNA fragment which
lacks the first selectable marker gene but comprises the DNA sequence of interest,
wherein a second vector comprises a second selectable marker gene which is
distinguishable from the first selectable marker gene and non-essential DNA
sequences, optionally including a counterselectable gene, which non-essential DNA
sequences are flanked by at least two restriction enzyme sites, wherein at least one
of the flanking restriction enzyme sites in the second vector is for a third restriction
enzyme which generates single-strand DNA overhangs which are complementary to
the single-strand DNA overhangs generated by the first restriction enzyme, wherein
at least one of the flanking restriction sites in the second vector is for a fourth
restriction enzyme which generates ends that are not complementary to the ends
generated by the first or third restriction enzyme but can be ligated to the ends
generated by the second restriction enzyme, and wherein digestion of the second
vector with the third and fourth restriction enzymes generates a second linear DNA
fragment which lacks the non-essential DNA sequences but comprises the second
selectable marker gene, wherein the second linear DNA fragment is flanked by ends
which permit the oriented joining of the first linear DNA fragment to the second
linear DNA fragment.

29. The method of claim 28 wherein the second restriction enzyme generates
blunt ends and the first linear DNA fragment is flanked by a first single-strand DNA
overhang and a blunt end.

30. The method of claim 28 wherein the first and third restriction enzymes are
not the same.

31. The method of claim 28 wherein the second and fourth restriction enzymes
are not the same.

32. The method of claim 28 wherein the second and fourth restriction enzymes
generate blunt ends.

33. The method of claim 28 wherein one of the restriction enzymes is a class IIS
restriction enzyme.

34. The method of claim 33 wherein the class IIS restriction enzyme is AccB7I,
AceIII, AclWI, AdeI, AhdI, Alw26I, AlwI, AlwNI, ApaBI, AspEI, AspI, AsuHPI,
BbsI, BbvI, BbvII, Bce83I, BcefI, BciVI, BfiI, BglI, BinI, BmrI, BpiI, BpmI, BpuAI,
BsaI, Bse3DI, Bse4I, BseGI, BseLI, BseRI, BsgI, BslI, BsmAI, BsmBI, BsmFI,
BspMI, BsrDI, Bst71I, BstAPI, BstF5I, BstXI, Bsu6I, DraIII, DrdI, DseDI,
Eam1104I, Eam1105I, EarI, EchHKI, Eco31I, Eco57I, EcoNI, Esp1396I, Esp3I, FokI,
FauI, GsuI, HgaI, HphI, MboII, MsiYI, MwoI, NruGI, PflMI, PflFI, PleI, SfaNI,
TspRI, Ksp632I, MmeI, RleAI, SapI, SfiI, TaqII, Tth111I, Tth111II, Van91I, XagI,
XcmI, or a restriction enzyme which has the same recognition site as AccB7I, AceIII,
AclWI, AdeI, AhdI, Alw26I, AlwI, AlwNI, ApaBI, AspEI, AspI, AsuHPI, BbsI, BbvI,
BbvII, Bce83I, BcefI, BciVI, BfiI, BglI, BinI, BmrI, BpiI, BpmI, BpuAI, BsaI,
Bse3DI, Bse4I, BseGI, BseLI, BseRI, BsgI, BslI, BsmAI, BsmBI, BsmFI, BspMI,
BsrDI, Bst71I, BstAPI, BstF5I, BstXI, Bsu6I, DraIII, DrdI, DseDI, Eam1104I,
Eam1105I, EarI, EchHKI, Eco31I, Eco57I, EcoNI, Esp1396I, Esp3I, FokI, FauI, GsuI,
HgaI, HphI, MboII, MsiYI, MwoI, NruGI, PflMI, PflFI, PleI, SfaNI, TspRI,
Ksp632I, MmeI, RleAI, SapI, SfiI, TaqII, Tth111I, Tth111II, Van91I, XagI, XcmI.

35. The method of claim 33 wherein one of the restriction enzymes is AvaI,
Ama87I, BcoI, BsoBI, Eco88I, AvaII, Eco47I, Bme18I, HgiEI, SinI, BanI, AccB1I,
BshNI, Eco64I, BfmI, BstSFI, SfcI, Bpu10I, BsaMI, BscCI, BsmI, Mva1269I,
Bsh1285I, BsaOI, BsiEI, BstMCI, Bse1I, BseNI, BsrI, Cfr10I, BsiI, BssSI, Bst2BI,
BsiZI, AspS9I, Cfr13I, Sau96I, Bsp1720I, BlpI, Bpu1102I, CelII, Bst4CI, BstDEI,
DdeI, CpoI, CspI, RsrII, DsaI, BstDSI, Eco24I, BanII, EcoT38I, FriOI, HgiJII,
Eco130I, StyI, BssT1I, EcoT14I, ErhI, EspI, BlpI, Bpu1102I, Bsp1720I, CelII,
HgiAI, BsiHKAI, Alw21I, AspHI, Bbv12I, HinfI, PspPPI, PpuMI, Psp5II, SanDI,
SduI, Bsp1286I, BmyI, SecI, BsaJI, BseDI, SfcI, BfmI, BstSFI, SmlI, or a restriction
enzyme which has the same recognition site as AvaI, Ama87I, BcoI, BsoBI, Eco88I,
AvaII, Eco47I, Bme18I, HgiEI, SinI, BanI, AccB1I, BshNI, Eco64I, BfmI, BstSFI,
SfcI, Bpu10I, BsaMI, BscCI, BsmI, Mva1269I, Bsh1285I, BsaOI, BsiEI, BstMCI,
Bse1I, BseNI, BsrI, Cfr10I, BsiI, BssSI, Bst2BI, BsiZI, AspS9I, Cfr13I, Sau96I,
Bsp1720I, BlpI, Bpu1102I, CelII, Bst4CI, BstDEI, DdeI, CpoI, CspI, RsrII, DsaI,
BstDSI, Eco24I, BanII, EcoT38I, FriOI, HgiJII, Eco130I, StyI, BssT1I, EcoT14I,
ErhI, EspI, BlpI, Bpu1102I, Bsp1720I, CelII, HgiAI, BsiHKAI, Alw21I, AspHI,
Bbv12I, HinfI, PspPPI, PpuMI, Psp5II, SanDI, SduI, Bsp1286I, BmyI, SecI, BsaJI,
BseDI, SfcI, BfmI, BstSFI, SmlI.

36. The method of claim 1 or 28 wherein one of the restriction enzymes is AarI,
AscI, BbrCI, CspI, DraI, FseI, NotI, NruI, PacI, PmeI, PvuI, SapI, SdaI, SfiI, SgfI,
SplI, SrfI, SwaI, or a restriction enzyme that has the same recognition site as AarI,
AscI, BbrCI, CspI, DraI, FseI, NotI, NruI, PacI, PmeI, PvuI, SapI, SdaI, SfiI, SgfI,
SplI, SrfI, SwaI.

37. A method of inducing expression of a DNA sequence of interest in a host
cell, comprising contacting a recombinant host cell which is deficient in rhamnose
catabolism, and has a recombinant DNA molecule comprising a rhamnose-inducible
promoter operably linked to an open reading frame for a heterologous RNA
polymerase, with rhamnose and an expression vector comprising a promoter for the
heterologous RNA polymerase operably linked to a DNA sequence of interest.

38. The method of claim 37 wherein the DNA sequence of interest is flanked by
two restriction enzyme sites, wherein one of the flanking restriction enzyme sites is
for a first restriction enzyme which has infrequent restriction sites in cDNAs or
open reading frames from at least one species and generates single-strand DNA
overhangs, and wherein another flanking restriction enzyme site is for a second
restriction enzyme which has infrequent restriction sites in cDNAs or open reading
frames from at least one species and generates ends that are not complementary to
the overhangs generated by the first restriction enzyme.

39. A method comprising introducing a vector comprising a nucleic acid
fragment encoding a barnase which lacks a secretory domain into a recombinant
host cell which expresses barstar from a promoter which is constitutively expressed
in prokaryotic cells.

40. A method comprising introducing the vector system of claim 14 into a host
cell, wherein the second vector comprises a counterselectable gene comprising a
nucleic acid fragment encoding a barnase which lacks a secretory domain.

41. A vector comprising an open reading frame 3' to a DNA fragment of no
more than 30 base pairs, which DNA fragment comprises a ribosome binding site, a
SgfI recognition site, and a sequence which, when present in mRNA, enhances the
binding of the mRNA to the small subunit of a eukaryotic ribosome.

42. The vector of claim 41 wherein the DNA fragment includes
AAGGAGCGATCGCX1ATGX2 (SEQ ID NO:1), and wherein X1 and X2 are
individually an A, T, G or C.

43. A vector comprising a SgfI recognition site, a sequence which comprises
ATG and which sequence, when present in mRNA, enhances the binding of the
mRNA to the small subunit of a eukaryotic ribosome, and an open reading frame
which begins at the ATG in the sequence.
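
The layout of the claim 42 sequence can be inspected with a short sketch that expands X1 and X2 over all four bases. Two things here are assumptions rather than statements from the claims: that the leading AAGGAG is the ribosome binding site element, and that GCGATCGC is the SgfI recognition sequence (its commonly published site).

```python
from itertools import product

TEMPLATE = "AAGGAGCGATCGC{x1}ATG{x2}"  # SEQ ID NO:1 with X1 and X2 left open
RBS = "AAGGAG"            # assumed ribosome binding site element
SGFI_SITE = "GCGATCGC"    # assumed SgfI recognition sequence

variants = [TEMPLATE.format(x1=a, x2=b) for a, b in product("ATGC", repeat=2)]

layout = set()
for seq in variants:
    layout.add((
        len(seq),              # fragment length in bases
        seq.find(RBS),         # start of the assumed RBS element
        seq.find(SGFI_SITE),   # start of the SgfI site
        seq.find("ATG"),       # start of the ATG
    ))

print(len(variants), layout)
```

Under these assumptions every variant is 18 bases long, within the 30 base pair limit of claim 41, and the SgfI site starts at the last G of AAGGAG, so the two elements share one nucleotide.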

44. A vector comprising a SgfI recognition site 5' to a recognition site for a first
restriction enzyme which generates blunt ends, which vector, once digested with
SgfI and the first restriction enzyme and ligated to a DNA fragment comprising an
open reading frame flanked by an end generated by a second restriction enzyme that
generates a 3' TA overhang and an end generated by a third restriction enzyme
which has infrequent restriction sites in cDNAs or open reading frames from at least
one species and generates blunt ends, yields a recombinant vector comprising the
open reading frame.

45. The vector of claim 44 wherein the first and third restriction enzymes are the
same.

46. The vector of claim 44 wherein the first and third restriction enzymes are 
different. 


47. The vector of claim 44 wherein the first restriction enzyme is PmeI, EcoRV
or BalI.

48. The vector of claim 44 wherein the first restriction enzyme is PmeI, DraI,
EsaBC3I, HindIII, HpaI, ScaI or SwaI.

49. The vector of claim 44 wherein the first restriction enzyme is AluI, BalI,
BfrBI, BsaAI, BsaBI, BsrBI, BtrI, Cac8I, CdiI, CviJI, CviRI, Eco47III, Eco78I,
EcoICRI, EcoRV, FnuDII, FspAI, HaeI, HaeIII, Hpy8I, LpnI, MlyI, MslI, MstI,
NaeI, NlaIV, NruI, NspBII, OliI, PmaCI, PmeI, PshAI, PsiI, PvuII, RsaI, ScaI,
SmaI, SnaBI, SrfI, SspI, SspD5I, StuI, XcaI, XmnI, or ZraI.

50. The vector of claim 44 wherein the restriction enzyme that generates a 3' TA
overhang is SgfI.

51. The vector of claim 44 which further comprises an open reading frame
which includes the SgfI site.

52. The vector of claim 44 which comprises a ribosome binding site 5' to the
nucleotide cleaved by SgfI.

53. The vector of claim 44 wherein ligation generates the following sequence in
the recombinant vector AAGGAGCGATCGCYATG or
X1X2X3GCGATCGCCATG, wherein X1X2X3, X2X3G or X3GC is a codon which is
not a stop codon, and wherein Y is A, T, G or C.

54. The vector of claim 44 wherein ligation generates the following sequence in
the recombinant vector X1X2X3GTTTY1Y2, wherein X1X2X3 is a codon in an open
reading frame which is not a stop codon and Y1 and Y2 each = A, Y1 = A and Y2 = G
or Y1 = G and Y2 = A.

55. The vector of claim 44 wherein ligation generates the following sequence in
the recombinant vector X1X2X3GTTTY1Y2, wherein X1X2X3, X2X3G or X3GT is a
codon in an open reading frame which is not a stop codon and Y1 is not A when Y2
is A or G, or Y1 is not G when Y2 is A.
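
The Y1Y2 combinations in claims 54 and 55 can be related to the standard stop codons with a short enumeration. This sketch covers only the reading frame in which X1X2X3 is a codon, so that the junction reads X1X2X3, GTT, TY1Y2; the function name `junction_fate` is hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def junction_fate(y1: str, y2: str) -> str:
    """Classify the codon T-Y1-Y2 that follows GTT at the junction."""
    return "stop" if "T" + y1 + y2 in STOP_CODONS else "read-through"


stops = sorted(
    y1 + y2
    for y1, y2 in product("ACGT", repeat=2)
    if junction_fate(y1, y2) == "stop"
)
read_through = 16 - len(stops)
print(stops, read_through)
```

In this frame the three Y1Y2 pairs recited in claim 54 are exactly those that place a stop codon after GTT, and the remaining thirteen pairs, which are the ones claim 55 permits, continue the open reading frame.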


56. A vector comprising a first open reading frame which includes a SgfI
recognition site and a recognition site which is not in the open reading frame for a
restriction enzyme that has infrequent restriction sites in cDNAs or open reading
frames from at least one species and generates blunt ends, which vector, once
digested with SgfI and the restriction enzyme which has infrequent restriction sites
in cDNAs or open reading frames from at least one species and generates blunt
ends, and ligated to a DNA fragment comprising a second open reading frame flanked by
a single-strand 3' TA DNA overhang and a blunt end, yields a recombinant vector
comprising a third open reading frame comprising the first and second open reading
frames, which third open reading frame encodes a fusion peptide or protein.

57. A vector comprising a ribosome binding site which optionally overlaps by
one nucleotide with a SgfI recognition site and a recognition site which is not in the
open reading frame for a restriction enzyme that has infrequent restriction sites in
cDNAs or open reading frames from at least one species and generates blunt ends,
which vector, once digested with SgfI and the restriction enzyme that has infrequent
restriction sites in cDNAs or open reading frames from at least one species and
generates blunt ends, and ligated to a DNA fragment comprising an open reading
frame encoding a peptide or polypeptide flanked by

5'   CGCCATGX1Y1
3' TAGCGGTACX2Y2

and a blunt end, yields a recombinant vector which encodes the peptide or
polypeptide, wherein X1 is the first codon which is 3' to the start codon for the open
reading frame, wherein X2 is the complement of X1, wherein Y1 is the remainder of
the open reading frame, and wherein Y2 is the complement of Y1.

58. The vector of claim 57 wherein X1 = GR1R2, wherein R1 or R2 = A, T, C or
G.

59. A vector comprising a first open reading frame which includes a PmeI
recognition site and is flanked at the 5' end by a recognition site for a first
restriction enzyme that generates complementary single-strand DNA overhangs,
which vector, once digested with PmeI and the first restriction enzyme, and ligated
to a DNA fragment comprising a blunt end at the 5' end of a second open reading
frame and an end generated by a second restriction enzyme which generates
single-strand DNA overhangs which are complementary to the single-strand DNA
overhangs generated by the first restriction enzyme, yields a recombinant vector
comprising a third open reading frame comprising the first and second open reading
frames.



60. The vector of claim 59 wherein the third open reading frame includes
N1N2N3GTTTN4N5R, wherein N1N2N3 and TN4N5 are codons that do not code for a
stop codon, and wherein R is one or more codons.

61. The vector of claim 59 wherein the blunt end of the DNA fragment is
generated by a restriction enzyme other than PmeI.

62. The vector of claim 59 wherein the blunt end of the DNA fragment is 
generated by Pmel digestion. 

63. A vector comprising a first open reading frame which includes a PmeI
recognition site and is flanked at the 5' end by a site for a first restriction enzyme that
generates complementary single-strand DNA overhangs, which vector, once
digested with PmeI and the first restriction enzyme, and ligated to a DNA fragment
comprising a blunt end and an end generated by a second restriction enzyme which
generates single-strand DNA overhangs which are complementary to the
single-strand DNA overhangs generated by the first restriction enzyme, yields a
recombinant vector which includes N1N2N3GTTTN4N5, wherein N1N2N3GTTT is a
sequence from the 3' end of the digested expression vector, wherein N1N2N3 do not
code for a stop codon, and wherein N4 and N5 = A, or N4 = A and N5 = G or N4 = G
and N5 = A.

64. The vector of claim 63 wherein the blunt end of the DNA fragment is 
generated by Pmel digestion. 

65. The vector of claim 63 wherein the blunt end of the DNA fragment is
generated by a restriction enzyme other than PmeI.

66. A method for performing genetic analysis, comprising:

a) populating a database of genetic data with a plurality of genetic
records;

b) querying the database of genetic data to identify a first subset of
genetic records, wherein each record has at least one recognition site for one
predetermined restriction enzyme or for restriction enzymes included in a set of
predetermined restriction enzymes; and

c) determining a set of statistics associated with the restriction enzyme
recognition sites for at least a second subset of genetic records in the first subset.

67. The method of claim 66 wherein determining the set of statistics includes 
determining a number of genetic records including recognition sites for one 
predetermined restriction enzyme or for each of the predetermined restriction 
enzymes in the set. 


68. The method of claim 66 wherein determining the set of statistics includes 
determining a number of occurrences of at least one site for the one predetermined 
restriction enzyme or for the predetermined restriction enzymes in a genetic record 
in the second subset. 
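
Steps b) and c) of claim 66, together with the statistics of claims 67 and 68, can be sketched in a few lines. Everything concrete here is an assumption for illustration: the in-memory dictionary standing in for the database, the toy sequences, the function names, and the use of the published SgfI and PmeI sites as the predetermined set. Both sites are palindromic, so scanning one strand is sufficient in this example.

```python
import re
from collections import Counter

# Assumed predetermined enzyme set (published recognition sequences).
SITES = {"SgfI": "GCGATCGC", "PmeI": "GTTTAAAC"}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of `site`, including overlapping ones."""
    return len(re.findall(f"(?={site})", seq.upper()))


def site_statistics(records: dict[str, str], sites: dict[str, str] = SITES):
    # Occurrences of each site in each record (claim 68 style statistic).
    per_record = {
        rid: {enz: count_sites(seq, site) for enz, site in sites.items()}
        for rid, seq in records.items()
    }
    # First subset: records with at least one site for any enzyme in the set.
    first_subset = {rid: c for rid, c in per_record.items() if any(c.values())}
    # Number of records containing each enzyme's site (claim 67 style statistic).
    records_with_site = Counter(
        enz for counts in first_subset.values() for enz, n in counts.items() if n
    )
    return first_subset, records_with_site


# Hypothetical toy records, not real sequence data.
records = {
    "rec1": "ATGGCGATCGCTTGTTTAAACTAA",
    "rec2": "ATGAAACCCGGGTTTTAA",
    "rec3": "GCGATCGCGATCGC",
}
first_subset, records_with_site = site_statistics(records)
print(first_subset)
print(dict(records_with_site))
```

The lookahead pattern is used so that overlapping sites, as in the third toy record, are each counted; a plain `str.count` would report only one of them.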


69. The method of claim 66 wherein the genetic records comprise nucleic acid 
sequences. 

70. The method of claim 66 further comprising filtering the subset of genetic
records to include or exclude genetic records having one or more selected
characteristics.

71. The method of claim 66 further comprising filtering the subset of genetic
records to exclude genetic records having a size greater than a predetermined value.

72. The method of claim 71 wherein the predetermined value is 21000
characters.

73. The method of claim 66 further comprising determining the sequence of
specific bases which are present as ambiguous bases within a recognition site or
which are present between a recognition site for a restriction enzyme and the
position at which the restriction enzyme cleaves DNA containing the recognition
site.
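
The first alternative of claim 73, recovering the specific bases that fill ambiguous positions within a recognition site, can be sketched with a regular expression built from IUPAC codes. The function name, the toy sequence, and the choice of SfiI (published site GGCCNNNNNGGCC) as the example are assumptions; the second alternative, bases lying between a site and a distal cut position, is not covered.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def ambiguous_bases(seq: str, site: str) -> list[tuple[int, str]]:
    """Return (position, bases) for every match of `site` in `seq`,
    where `bases` are the specific bases found at ambiguous positions."""
    pattern = "".join(
        f"({IUPAC[base]})" if base not in "ACGT" else base for base in site
    )
    # A lookahead keeps overlapping matches; its groups remain accessible.
    return [
        (m.start(), "".join(m.groups()))
        for m in re.finditer(f"(?={pattern})", seq.upper())
    ]


# Hypothetical toy sequence containing one SfiI-like site.
hits = ambiguous_bases("TTGGCCATCGAGGCCAA", "GGCCNNNNNGGCC")
print(hits)
```

For an enzyme that cuts within the ambiguous stretch, the recovered bases are what determine the overhang sequence, which is why they matter when selecting compatible sites.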

74. The method of claim 66 wherein at least one of the restriction enzymes has a
6 bp, 7 bp or 8 bp recognition site.

75. The method of claim 66 wherein at least one of the restriction enzymes is a 
hapaxoterministic restriction enzyme. 

76. A computerized system for genetic analysis, comprising:

a database of genetic data;

a processor;

a set of one or more programs executed by the processor causing the
processor to:

query the database of genetic data to identify a first subset of genetic
records, wherein each record has at least one recognition site for one
predetermined restriction enzyme or for restriction enzymes included in a set
of predetermined restriction enzymes; and

determine a set of statistics associated with the restriction enzyme
recognition sites for at least a second subset of genetic records in the first
subset.

77. A recombinant vector prepared by digesting a vector comprising a SgfI
recognition site 5' to a recognition site for a first restriction enzyme which generates
blunt ends, with SgfI and the first restriction enzyme and ligating the digested vector
to a DNA fragment comprising an open reading frame flanked by an end generated
by a second restriction enzyme that generates a 3' TA overhang and an end
generated by a third restriction enzyme which has infrequent restriction sites in
cDNAs or open reading frames from at least one species and generates blunt ends.

78. A support comprising a plurality of recombinant vectors, one or more of
which comprise a different open reading frame, wherein each recombinant vector is
prepared by digesting a vector comprising a SgfI recognition site 5' to a recognition
site for a first restriction enzyme which generates blunt ends, with SgfI and the first
restriction enzyme and ligating the digested vector to a DNA fragment comprising
an open reading frame flanked by an end generated by a second restriction enzyme
that generates a 3' TA overhang and an end generated by a third restriction enzyme
which has infrequent restriction sites in cDNAs or open reading frames from at least
one species and generates blunt ends.

79. The support of claim 78 which is a multi-well plate, the wells of which
optionally each comprise a different recombinant vector.

80. The support of claim 78 wherein the different open reading frames include
open reading frames having nucleotide substitutions of a selected open reading
frame, which different open reading frames are prepared by mutagenesis of the
selected open reading frame.